IPython is an ingenious combination of a bash-like terminal with a Python shell. It can be used both for bash-related affairs, such as copying files around and creating directories, and for actual Python programming. In fact, the two can be combined to create a truly powerful shell.
Alternatively, Jupyter provides an attractive graphical interface for performing data analysis, or for demonstrating Pylada, as in this notebook.
Pylada puts these tools to good use by providing a command-line approach to manipulate job-folders (see the relevant notebook for more information), launch actual calculations, and collect the results. When used in conjunction with Python plotting libraries, e.g. matplotlib, it can provide rapid turnaround from conceptualization to result analysis.
Assuming that Pylada is installed, its IPython extension can be loaded in IPython/Jupyter with:
In [1]:
%load_ext pylada
Pylada's IPython interface revolves around job-folders. In order to explore its features, we first need to create job-folders, preferably some which do not involve heavy calculations. The following creates a dummy.py file in the current directory. It contains a dummy functional that does very little work. In actual runs, everything dummy would be replaced with wrappers to VASP or Quantum Espresso.
In [2]:
%%writefile dummy.py
def functional(structure, outdir=None, value=False, **kwargs):
    """ A dummy functional """
    from copy import deepcopy
    from pickle import dump
    from random import random
    from py.path import local

    structure = deepcopy(structure)
    structure.value = value
    outdir = local(outdir)
    outdir.ensure(dir=True)
    # Write a fake OUTCAR containing a random "energy" and the inputs.
    dump((random(), structure, value, functional), outdir.join('OUTCAR').open('wb'))
    return Extract(outdir)

def Extract(outdir=None):
    """ An extraction function for a dummy functional """
    from os import getcwd
    from collections import namedtuple
    from pickle import load
    from py.path import local

    if outdir is None:
        outdir = local(getcwd())
    Extract = namedtuple('Extract', ['success', 'directory',
                                     'structure', 'energy', 'value', 'functional'])
    outdir = local(outdir)
    if not outdir.check():
        return Extract(False, str(outdir), None, None, None, None)
    if not outdir.join('OUTCAR').check(file=True):
        return Extract(False, str(outdir), None, None, None, None)
    with outdir.join('OUTCAR').open('rb') as file:
        energy, structure, value, functional = load(file)
    return Extract(True, outdir, structure, energy, value, functional)

functional.Extract = Extract
The notebook about creating job folders has more details about this functional. For now, let us create a jobfolder with a few jobs:
In [3]:
from dummy import functional
from pylada.jobfolder import JobFolder
from pylada.crystal.binary import zinc_blende
root = JobFolder()
structures = ['diamond', 'diamond/alloy', 'GaAs']
stuff = [0, 1, 2]
species = [('Si', 'Si'), ('Si', 'Ge'), ('Ga', 'As')]
for name, value, species in zip(structures, stuff, species):
    job = root / name
    job.functional = functional
    job.params['value'] = value
    job.params['structure'] = zinc_blende()
    for atom, specie in zip(job.structure, species):
        atom.type = specie
In [4]:
%mkdir -p tmp
%savefolders tmp/dummy.dict root
The next time IPython is entered, the job-folder can be loaded from disk with:
In [5]:
%explore tmp/dummy.dict
Once a folder has been explored from disk, %savefolders can be called without arguments.
The percent (%) sign indicates that these commands are IPython magic functions. To get more information about what Pylada's magic functions do, call them with "--help".
In [6]:
%explore --help
Tip: The current job-folder and the current job-folder path are stored in pylada.interactive.jobfolder and pylada.interactive.jobfolder_path. In practice, accessing those directly is rarely needed.
In [7]:
%listfolders all
This prints out the executable jobs. It can also be used to examine the content of specific subfolders.
In [8]:
%listfolders diamond/*
The syntax is the same as for the bash command line. When given an argument other than "all", %listfolders lists only the matching subfolders, including those which are not executable; in practice, it works like "ls -d". Executable job-folders are those that have been assigned a functional.
In [9]:
%goto /diamond
The current job-folder is now diamond. Were there a corresponding sub-directory on disk, the current working directory would also be diamond. As it is, we have not yet launched the calculations, so no such directory exists. This feature makes it easy to navigate both job-folders and output directories simultaneously.
We can check the subfolders contained within /diamond.
In [10]:
%listfolders
And calling %goto without an argument will print out the current location (much like pwd does for directories).
In [11]:
%goto
We can also use relative paths, as well as .., to navigate around the tree structure. Almost any path that works for cd will work with %goto as well.
In [12]:
%goto ..
%goto
%listfolders
In [13]:
%goto /diamond/alloy/
assert jobparams.current.functional == functional
Parameters can be accessed either through the params dictionary:
In [14]:
jobparams.current.params.keys()
Out[14]:
Or directly as attributes of jobparams.current:
In [15]:
assert jobparams.current.value == 1
In [16]:
%goto /
jobparams.structure.name
Out[16]:
There are two things to note here: first, jobparams can chain attributes (here name) of attributes (here structure) to any degree of nesting; second, if the parameters of a given job do not contain the nested attribute, that job is simply ignored. We can set parameters much the same way:
In [17]:
jobparams.structure.name = 'hello'
jobparams.structure.name
Out[17]:
By default, it is only possible to modify existing attributes, as opposed to adding new ones.
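The chain-and-skip behavior described above can be mimicked with plain Python attribute access. The following is a stand-alone sketch (the jobs dictionary and the chain_get helper are hypothetical illustrations, not part of Pylada):

```python
from types import SimpleNamespace

# Hypothetical stand-ins for executable job-folders and their parameters.
jobs = {
    '/diamond': SimpleNamespace(structure=SimpleNamespace(name='diamond')),
    '/GaAs': SimpleNamespace(value=2),  # no 'structure' attribute: skipped below
}

def chain_get(obj, attrs):
    """Follow a dotted attribute chain, e.g. 'structure.name'."""
    for attr in attrs.split('.'):
        obj = getattr(obj, attr)
    return obj

# Jobs lacking the nested attribute are ignored, much as jobparams does.
names = {}
for path, params in jobs.items():
    try:
        names[path] = chain_get(params, 'structure.name')
    except AttributeError:
        pass
```

Here names ends up holding only the one job whose parameters actually carry a structure.name attribute.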
Finally, it is possible to focus on a specific sub-set of job-folders. By default the syntax is that of a unix shell. However, the syntax can be switched to regular expressions via the Pylada parameter pylada.unix_re. Only the former syntax is illustrated here:
In [18]:
jobparams['*/alloy'].structure.name
Out[18]:
Note that when only one item is left in the dictionary, that item is returned directly. Indeed, there is only one job-folder which corresponds to "*/alloy". This behavior can be turned on and off using the parameters jobparams_naked_end and/or JobParams.naked_end. The unix shell-like patterns can be either absolute paths, when preceded with '/', or relative. In the latter case, they are relative to the current position in the job-folder, as changed by %goto.
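The unix shell-like matching can be illustrated with the standard library's fnmatch, which implements the same glob syntax (this is only a stand-alone sketch; Pylada's internal matcher may differ in details):

```python
from fnmatch import fnmatch

# The job-folder names created earlier in this notebook.
names = ['diamond', 'diamond/alloy', 'GaAs']

# '*/alloy' matches exactly one folder, which is why jobparams would
# return that single value directly rather than a one-item dictionary.
matches = [name for name in names if fnmatch(name, '*/alloy')]
```

Only 'diamond/alloy' survives the filter, mirroring the single-item result above.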
When the return looks like a dictionary, it behaves like a dictionary. Hence it can be iterated over:
In [19]:
for key, value in jobparams['diamond/*'].structure.name.items():
print(key, value)
In [20]:
%goto /
jobparams['diamond/alloy'].onoff = 'off'
jobparams.onoff
Out[20]:
When "off", a job-folder is ignored by jobparams (and by collect, described below). Furthermore, it will not be executed; the only way to access it again is to turn it back on. Groups of calculations can be turned on and off using the unix shell-like syntax shown previously.
WARNING: Always save the job-folder after changing its on/off status, because the computations will re-read the dictionary from disk.
In [21]:
%savefolders
Once job-folders are ready, it takes all of one line to launch the calculations:
IPython
%launch scattered
This will create one PBS/Slurm job per executable job-folder. A number of options are available to select the number of processors, the account or queue, the walltime, and so on. To examine them, run %launch scattered --help:
In [22]:
%launch scattered --help
Most default values are taken from pylada.default_pbs. The number of processors defaults to the even number closest to the number of atoms in the structure (apparently, a recommended VASP default). It can be given either as an integer, or as a function which takes a job-folder as its only argument and returns an integer.
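Such a function might look like the sketch below. The processor_count name and the SimpleNamespace stand-in are hypothetical illustrations; the exact way to pass the function to %launch depends on your Pylada configuration:

```python
from types import SimpleNamespace

def processor_count(folder):
    """Even number of processors closest to the atom count (rounding up)."""
    natoms = len(folder.params['structure'])
    return natoms if natoms % 2 == 0 else natoms + 1

# Stand-in for a job-folder: only the 'params' attribute is used here.
folder = SimpleNamespace(params={'structure': ['Si', 'Ge', 'Si']})
```

For the three-atom stand-in above, processor_count(folder) returns 4.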
Other possibilities for launching jobs can be obtained as follows:
In [23]:
%launch --help
In this notebook, we will be using %launch interactive, since the jobs are simple and since we cannot be sure that Pylada has been configured for PBS, Slurm, or another queueing system.
In [24]:
%launch interactive
At this juncture, the jobs should have created a number of output files in the directory where the file dummy.dict is located. You may remember from the start of this lesson that we loaded the dictionary with %explore tmp/dummy.dict. The location of that file is what matters; the current working directory does not.
In [25]:
%%bash
command -v tree > /dev/null && tree || ls -R
You will notice that the job in diamond/alloy did not run, since it is off. If you go back up a few cells, set it to on, save, and then rerun %launch interactive, you should see that it gets computed.
We can now navigate, using %goto, simultaneously through the job-folder and the disk:
In [26]:
%goto /diamond
print("current location: ", jobparams.current.name)
In [27]:
%%bash
command -v tree > /dev/null && tree || ls -R
In [28]:
%goto /
collect.success
Out[28]:
Our dummy functional is too simple to fail... However, if you delete any given calculation directory and try again, you will find some False results. Beware that some collected results are cached so they can be retrieved faster the second time around; re-running %explore on the dictionary might be necessary.
Warning: success means that the calculations ran to completion. It does not mean that the results are not garbage.
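The success flag follows the same pattern as the Extract function in dummy.py: a missing output file yields a record with success set to False. A minimal stand-alone version of that pattern, using only the standard library (the Result and extract names are illustrative, not Pylada API):

```python
import os
from collections import namedtuple

Result = namedtuple('Result', ['success', 'directory'])

def extract(outdir):
    """Report failure when the expected OUTCAR file is absent."""
    if not os.path.isfile(os.path.join(outdir, 'OUTCAR')):
        return Result(False, outdir)
    return Result(True, outdir)
```

Deleting a calculation directory therefore flips its record to success=False on the next collection.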
Results from the calculation can be retrieved in much the same way as parameters were examined. This time, however, we use an object called collect (still without preceding "%" sign). Assuming the job-folders created earlier were launched, the random energies created by our fake functional could be retrieved as in:
In [29]:
collect.energy
Out[29]:
What exactly can be collected this way depends on the actual calculation. The easiest way to examine what's available is to type collect.[TAB]. The collected results can be iterated over, or focused down to a few relevant calculations, exactly as was done with jobparams. The advantage is that further job-folders can easily be constructed which take the calculations a bit further. For instance, we have created job-folders which minimize spin-polarized crystal structures; a second wave of job-folders is then created from the resulting relaxed crystal structures to examine different possible magnetic orders.